What is claimed is: 



1 . A processor comprising: 

a processing core to execute a trace having one or more lines of one or more 
micro-operations; and 

an optimizer to optimize the trace upon each execution of the trace by the 
processing core. 

2. The processor of claim 1, wherein the optimizer is a pipehned optimizer. 

3. The processor of claim 1, further comprising a trace cache to store a trace from 
said optimizer. 

4. The processor of claim 3, further comprising: 

an instruction cache to store static code received from a compiler via a 
memory; 

a mite to translate the static code into micro-operations; and 
a fill buffer to build a trace from the micro-operations. 

5. The processor of claim 4, further comprising a trace queue to store one or more 

lines of one or more traces from the fill buffer and one or more lines from one or more 

traces from the trace cache. 
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6. The processor of claim 5, further comprising an allocator to send traces from 
the trace queue to the processing core and the optimizer. 

7. The processor of claim 1, wherein the processing core is an out of order 
processing core. 

8. The processor of claim 1, wherein the optimizer is to track optimizations 
executed on a specific trace. 

9. The processor of claim 1 , wherein the optimizer is to pack the trace after 
optimization. 

1 0. The processor of claim 9, wherein the optimizer is to pack the trace by 
optimizing two consecutive lines of a trace simultaneously. 

1 1 . The processor of claim 10, wherein the optimizer is to use an altemating offset 
to determine the two consecutive lines of the trace to optimize together. 

12. The processor of claim 1, wherein optimizations includes at least one of a 
group of optimizations consisting of call return elimination, dead code elimination, 
dynamic |iop fusion, binding, load balancing, move elimination, common sub- 
expression elimination, constant propagation, redundant load elimination, store 
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forwarding, memory renaming, trace specialization, value specialization, reassociation, 
and branch promotion. 

13. The processor of claim 1, wherein the optimizer executes optimizations based 
on runtime information collected during execution of the trace. 

14. The processor of claim 13, wherein the runtime information is appended to the 
trace in the trace cache. 

15. The processor of claim 13, further comprising a runtime information buffer to 
store the runtime information, the runtime information buffer mapped to the trace 
cache to match the runtime information with the trace. 

16. An optimization unit comprising: 

an input to receive a trace each time the trace is sent to a processing core; and 
an optimizer to optimize the trace, 

1 7. The optimizing unit of claim 16, wherein the optimizer is a pipelined 
optimizer. 

18. The optimizing xmit of claim 16, further comprising an output connected to a 

trace cache to store an optimized trace after optimization by the optimizer. 
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19. The optimizing unit of claim 16, wherein the input is connected to an 
allocator, the allocator to send traces from a trace queue storing optimized and 
unoptimized traces to the processing core and the optimizer. 

20. The optimizing unit of claim 16, wherem the optimizer tracks optimizations 
executed on a specific trace. 

21. The optimizing imit of claim 16, wherein the optimizer packs the trace after 
optimization. 

22. The optimizing unit of claim 21, wherein the optimizer packs the trace by 
optimizing two or more consecutive lines of a trace simultaneously. 

23. The optimizing unit of claim 22, wherein the optimizer uses an alternating 
offset to determine the two or more consecutive lines of the trace to optimize. 

24. The optimizing unit of claim 16, wherein optimizations includes at least one of 
a group of optimizations consisting of call return elimination, dead code elimination, 
dynamic jiop fusion, binding, load balancing, move elimination, common sub- 
expression elimination, constant propagation, redundant load elimination, store 
forwarding, memory renaming, trace specialization, value specialization, reassociation, 
and branch promotion. 
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25. The optimizing unit of claim 16, wherein the optimizer executes optimizations 
based on runtime information collected during execution of the trace. 

26 . A method comprising: 

executing a trace in a processing core; and 

simultaneously optimizing the trace each time the trace is executed. 

27. The method of claim 26, further including storing the trace after optimization 
in a trace cache. 

28. The method of claim 27, further including storing unoptimized traces to be 
processed and optimized. 

29. The method of claim 28, further comprising: 
storing static code from a compiler; 

translating the static code into micro-operations; and 
building an unoptimized trace from the micro-operations. 

30. The method of claim 26, wherein the processing core is an out of order 
processing core. 
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3 1 . The method of claim 26, further including tracking optimizations executed on a 
specific trace. 

32. The method of claim 26, further including packing the trace after optimization. 

33. The method of claim 32, wherein the trace is packed by optimizing two or 
more consecutive lines of a trace simultaneously. 

34. The method of claim 33, further including using an alternating offset to 
determine the two or more consecutive lines of the trace to optimize. 

35. The method of claim 26, wherein optimizing includes at least one of a group of 
optimizations consisting of call return elimination, dead code elimination, dynamic 
|iop fusion, binding, load balancing, move elimination, common sub-expression 
elimination, constant propagation, redundant load elimination, store forwarding, 
memory renaming, trace speciaUzation, value specialization, reassociation, and branch 
promotion. 

36. The method of claim 26, further including optimizing based on runtime 
information collected during execution of the trace. 
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37. The method of claim 36, further including appending the runtime information 
to the trace. 

38. A system comprising: 

a memory to store a trace; 

a processor coupled to said memory to execute a trace in a processing core and 
to simultaneously optimize the trace each time the trace is executed. 

39. The system of claim 38, wherein the processor has an out of order processing 
core. 

40. The system of claim 38, wherein the processor tracks optimizations executed 
on a specific trace. 

41 . The system of claim 38, wherein the processor packs the trace after 
optimization. 

42. The system of claim 41 , wherein the trace is packed by optimizing two or more 
consecutive lines of a trace simultaneously. 

43. The system of claim 42, wherein an alternating offset is used to determine the 

two or more consecutive lines of the trace to optimize. 
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44. The system of claim 38, wherein optimizing includes at least one of a group of 
optimizations consisting of call return elimination, dead code elimination, dynamic 
flop fusion, binding, load balancing, move elimination, common sub-expression 
elimination, constant propagation, redundant load elimination, store forwarding, 
memory renaming, trace specialization, value specialization, reassociation, and branch 
promotion. 

45. The system of claim 38, wherein the trace is optimized based on runtime 
information collected during execution. 

46. A set of instructions residing in a storage medium, said set of instructions capable of 
being executed by a processor to implement a method for processing data, the method 
comprising: 

executing a trace in a processing core; and 

simultaneously optimizing the trace each time the trace is executed. 

47. The set of instructions of claim 46, further including tracking optunizations 
executed on a specific trace. 

48. The set of instructions of claim 46, further including packing the trace after 
optimization. 
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49. The set of instructions of claim 48, wherein the trace is packed by optimizing 
two or more consecutive lines of a trace simultaneously. 

50. The set of instructions of claim 49, further including using an alternating offset 
to determine the two or more consecutive lines of the trace to optimize. 

5 1 . The set of instructions of claim 46, wherein optimizing includes at least one of 
a group of optimizations consisting of call return elimination, dead code elimination, 
dynamic )aop fusion, binding, load balancing, move elimination, common sub- 
expression elimination, constant propagation, redundant load elimination, store 
forwarding, memory renaming, trace specialization, value speciaUzation, reassociation, 
and branch promotion. 

52. The set of instructions of claim 46, further including optimizing based on 
runtime information collected during execution of the trace. 

53. The set of instructions of claim 52, further including appending the runtime 
information to the trace. 
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